WhatsHap: Weighted Haplotype Assembly for Future-Generation Sequencing Reads

Abstract

The human genome is diploid, which requires assigning heterozygous single nucleotide polymorphisms (SNPs) to the two copies of the genome. The resulting haplotypes, lists of SNPs belonging to each copy, are crucial for downstream analyses in population genetics. Currently, statistical approaches, which are oblivious to direct read information, constitute the state of the art. Haplotype assembly, which addresses phasing directly from sequencing reads, suffers from the fact that sequencing reads of the current generation are too short to serve the purposes of genome-wide phasing. While future-technology sequencing reads will contain a sufficient number of SNPs per read for phasing, they are also likely to suffer from higher sequencing error rates. Currently, no haplotype assembly approaches exist that allow for taking both increasing read length and sequencing error information into account. Here, we suggest WhatsHap, the first approach that yields provably optimal solutions to the weighted minimum error correction problem in runtime linear in the number of SNPs. WhatsHap is a fixed-parameter tractable (FPT) approach with coverage as the parameter. We demonstrate that WhatsHap can handle datasets of coverage up to 20×, and that 15× is generally enough for reliably phasing long reads, even at significantly elevated sequencing error rates. We also find that the switch and flip error rates of the haplotypes we output are favorable when comparing them with state-of-the-art statistical phasers.
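To make the objective named in the abstract concrete, here is a minimal Python sketch of the weighted minimum error correction (wMEC) problem on a toy instance. It is not the WhatsHap algorithm: it is a brute-force search over all bipartitions of the reads, exponential in the number of reads, whereas the abstract describes an approach that is linear in the number of SNPs with coverage as the parameter. The read encoding (a dict from SNP index to an `(allele, weight)` pair), the function names, and the toy data are all assumptions made for illustration. The sketch also does not force the two haplotypes to be complementary.

```python
from itertools import product


def partition_cost(reads, part):
    """Minimum total weight of allele flips needed so that all reads
    in `part` agree at every SNP position they cover."""
    cost = 0
    positions = {p for r in part for p in reads[r]}
    for p in positions:
        weight_of = [0, 0]  # summed weights of reads showing allele 0 / allele 1
        for r in part:
            if p in reads[r]:
                allele, weight = reads[r][p]
                weight_of[allele] += weight
        # Flip the lighter side so the column becomes conflict-free.
        cost += min(weight_of)
    return cost


def brute_force_wmec(reads):
    """Try every bipartition of the reads and return (cost, assignment).
    assignment[i] is 0 or 1, the haplotype that read i is placed on."""
    n = len(reads)
    best = None
    for assignment in product((0, 1), repeat=n):
        if assignment[0] == 1:
            continue  # swapping the two parts gives the same solution
        parts = (
            [i for i in range(n) if assignment[i] == 0],
            [i for i in range(n) if assignment[i] == 1],
        )
        cost = sum(partition_cost(reads, part) for part in parts)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


# Hypothetical toy instance: 4 reads over 3 heterozygous SNPs.
# Each entry maps SNP index -> (observed allele, confidence weight).
reads = [
    {0: (0, 10), 1: (0, 10)},
    {0: (1, 10), 1: (1, 10), 2: (1, 10)},
    {1: (0, 10), 2: (0, 10)},
    {0: (0, 10), 1: (1, 3), 2: (0, 10)},  # low-weight call at SNP 1
]

cost, assignment = brute_force_wmec(reads)
print(cost, assignment)
```

On this toy input the cheapest solution places reads 0, 2 and 3 on one haplotype and read 1 on the other, at cost 3: only the low-weight call at SNP 1 in the last read has to be corrected. The weights are what separate wMEC from the unweighted problem, since a low-confidence call is cheaper to overrule than a high-confidence one.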
